Abstract. We investigate the optimal estimation of a quantum process that can 
possibly consist of multiple time steps. The estimation is implemented by a quantum 
network that interacts with the process by sending an input and processing the output 
at each time step. We formulate the search of the optimal network as a semidefinite 
program and use duality theory to give an alternative expression for the maximum 
payoff achieved by estimation. Combining this formulation with a technique devised 
by Mittal and Szegedy we prove a general product rule for the joint estimation of 
independent processes, stating that the optimal joint estimation can be achieved by 
estimating each process independently, whenever the figure of merit is of a product 
form. We illustrate the result in several examples and exhibit counterexamples showing 
that the optimal joint network may not be the product of the optimal individual 
networks if the processes are not independent or if the figure of merit is not of the 
product form. In particular, we show that entanglement can reduce by a factor K the 
variance in the estimation of the sum of K independent phase shifts. 
1. Introduction 

Quantum theory offers impressive advantages over classical theory in the estimation of 
physical parameters [H |2j El IU El [7J [TOj HH [121 H3] ■ The prototypical example is the 
estimation of an unknown phase shift [HI HI [TTJ [12]: here the variance vanishes as $N^{-2}$ 
with the number $N$ of accesses to the phase-shifting process, whereas classical statistics 
over independent copies would give the scaling $N^{-1}$. The quadratic improvement is 
achieved by preparing an entangled state of $N$ systems and applying the unknown 
process to each system. The same quadratic advantage can be found in the estimation 
of a direction in space [51 E] and in the joint estimation of three Cartesian axes El E]- 

Given the usefulness of entanglement in the estimation of a single parameter from 
multiple accesses to a physical process, it is natural to ask whether entanglement can 
improve the estimation of many parameters corresponding to different processes. For 
example, one may wonder whether entanglement can help in the estimation of two 
independent phase shifts. In a slightly different context, this type of question was 
originally addressed by Wootters in an unpublished work and by DiVincenzo, Terhal, 
and Leung [14], who asked whether a joint entangled measurement can improve the 
extraction of information about two bits encoded in two independent sets of states. In 
this scenario, it was shown that the amount of information that can be extracted 
from the product set is additive [14]. More recently, a different proof showing the 
optimality of product measurements for the extraction of information from general 
product sets of states was provided in Ref. [15]. 

In this paper we address the problem of the joint estimation of the parameters 
encoded in a set of independent processes, where each process can consist of several 
time steps. Due to the possibility of connecting an input of an unknown process with 
the output of another one, here the question whether quantum correlations can improve 
the estimation is not only a question about the usefulness of entanglement in the input 
states and in the measurements, but also a question about the usefulness of quantum 
correlations in time, namely correlations mediated by the exchange of quantum systems 
from one time step to the next. We address the question in the framework of quantum 
estimation [16, 17], where the figure of merit is the expected payoff associated to a payoff 
function $g(\hat{x}, x)$, which depends on the true value $x$ and on the estimated value $\hat{x}$ labelling 
the unknown process. In this context we prove a general product theorem, showing that 
the optimal joint estimation of a set of independent parameters $\mathbf{x} := (x_1, \dots, x_K)$ can 
be achieved by estimating each parameter independently whenever the figure of merit 
is of the product form $g(\hat{\mathbf{x}}, \mathbf{x}) = \prod_{k=1}^{K} g_k(\hat{x}_k, x_k)$, where $g_k$ is the payoff function for the 
parameter $x_k$. In particular, our result implies that the maximum probability of success 
in identifying a set of unknown processes is the product of the maximum probabilities 
of success in identifying each individual process separately. 

Product theorems are a key tool in theoretical computer science [18, 19, 20, 21, 22, 23, 24], 
where one is often interested in how the resources needed to solve several 
independent problems jointly are related to the resources needed to solve each problem 
individually. Our work begins to explore the usefulness of these techniques in the domain 
of physics, starting from the fundamental problem of identifying a set of independent 
physical parameters. In order to prove our result we use the framework of quantum 
combs [25, 26] (see also the work by Gutoski and Watrous on quantum strategies [27]). 
In this framework we formulate the maximization of the expected payoff as a semidefinite 
program, and present an intuitive formulation of the dual minimization program. Such 
a dual formulation is interesting in its own right, as it generalizes to arbitrary processes 
and arbitrary payoff functions a classic formula derived by Yuen, Kennedy, and Lax [28] 
for the minimum error state discrimination. Exploiting the form of the primal and dual 
programs, we then prove our product theorem following a general technique devised by 
Mittal and Szegedy in Ref. [23] (see also Ref. [21]), which is adapted here in order to 
deal with the optimization of quantum networks consisting of multiple time steps. 



2. Quantum networks for process estimation 



Suppose that an experimenter has access to a physical process $\mathcal{P}_x$ that depends on an 
unknown parameter $x$ in some parameter space $\mathsf{X}$. The goal of the experimenter is to 
determine the parameter $x$ with the maximum precision allowed by the laws of quantum 
mechanics. 

Generally, the process $\mathcal{P}_x$ can consist of $N$ time steps, labelled by an index $s$ in 
some finite set $S = \{s_1, \dots, s_N\} \subset \mathbb{N}$, ordered so that $s_m < s_n$ for $m < n$. At each 
time step $s \in S$ the process transforms an input quantum system, with Hilbert space 
denoted by $\mathcal{H}^{(s)}_{\mathrm{in}}$, into a (possibly different) output quantum system, with Hilbert space 
denoted by $\mathcal{H}^{(s)}_{\mathrm{out}}$. If the process $\mathcal{P}_x$ is memoryless, all time steps are independent and 
one can associate a quantum channel to each time step. The quantum channel at step 
$s$, denoted by $\mathcal{C}^{(s)}_x$, will be a completely positive trace-preserving map sending density 
matrices on $\mathcal{H}^{(s)}_{\mathrm{in}}$ to density matrices on $\mathcal{H}^{(s)}_{\mathrm{out}}$. Hence, the process $\mathcal{P}_x$ can be described 
by a time-ordered sequence of quantum channels, each channel labelled by the unknown 
parameter $x$, as in the following picture: 
In the easiest case, one may have the same channel at each time step, namely $\mathcal{C}^{(s)}_x = \mathcal{C}_x$ 
for every $s \in S$. This is the case, e.g., of quantum phase estimation [3j HI [101 ttB fl2| [13], 
where one has access to $N$ uses of the unitary channel $\mathcal{C}_x(\rho) = U_x \rho U_x^\dagger$, with $U_x = \exp(ixH)$ 
for some Hamiltonian $H$ with integer spectrum. 
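
As a small illustration of the phase-shift channel just described, the following sketch builds $U_x = \exp(ixH)$ for a diagonal Hamiltonian and checks that an integer spectrum makes the channel $2\pi$-periodic in $x$. The spectrum `[0, 1, 2]`, the probe state, and the function names are assumptions chosen for illustration only.

```python
import numpy as np

def phase_unitary(x, spectrum):
    """Return U_x = exp(i x H) for a diagonal H with the given spectrum."""
    return np.diag(np.exp(1j * x * np.asarray(spectrum, dtype=float)))

def phase_channel(rho, x, spectrum):
    """Apply the unitary channel rho -> U_x rho U_x^dagger."""
    U = phase_unitary(x, spectrum)
    return U @ rho @ U.conj().T

# Hypothetical example: a three-level system with integer spectrum {0, 1, 2}.
spectrum = [0, 1, 2]
psi = np.ones(3) / np.sqrt(3)          # uniform superposition as probe state
rho = np.outer(psi, psi.conj())

x = 0.7
out_x = phase_channel(rho, x, spectrum)
out_shifted = phase_channel(rho, x + 2 * np.pi, spectrum)

# Integer spectrum => the channel depends on x only modulo 2*pi.
periodic = np.allclose(out_x, out_shifted)
# Unitary channels preserve the trace.
trace_preserved = np.isclose(np.trace(out_x).real, 1.0)
```

The periodicity is what allows the parameter to be treated as a phase in $[0, 2\pi)$.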

In the presence of memory, the input-output transformation at the step $s$ is 
described by a quantum channel involving internal ancillas: in this case the quantum 
channel $\mathcal{C}^{(s)}_x$ transforms density matrices on $\mathcal{H}^{(s)}_{\mathrm{in}} \otimes \mathcal{A}_{s-1}$ to density matrices on $\mathcal{H}^{(s)}_{\mathrm{out}} \otimes \mathcal{A}_s$, 
where $\mathcal{A}_s$ is the Hilbert space of the $s$-th ancilla. Hence, the process $\mathcal{P}_x$ is represented 
by a time-ordered sequence of black boxes with internal memories. 
Note that, since the ancillas are internal to the network, the first and last ancillary 
systems are trivial, $\mathcal{A}_0 = \mathcal{A}_N = \mathbb{C}$. 

The most general strategy to estimate an unknown parameter from a time-ordered 
sequence of black boxes consists in inserting the black boxes in a quantum network 
where the black boxes are interspersed with known quantum gates and eventually a 
quantum measurement is performed on the output, producing the estimate $\hat{x} \in \mathsf{X}$. 

The estimation process can be depicted as a network 
where $\mathcal{B}_s$, $s \in S$, are the internal ancillas of the estimating network, $\Psi$ is a quantum 
state on $\mathcal{B}_{s_1} \otimes \mathcal{H}^{(s_1)}_{\mathrm{in}}$, each $\mathcal{U}_s$ is a quantum channel, and $P_{\hat{x}}$ is a quantum measurement, 
described by a positive operator valued measure (POVM) on the Hilbert space $\mathcal{B}_{s_N} \otimes \mathcal{H}^{(s_N)}_{\mathrm{out}}$. 

Examples of quantum networks for the estimation of unknown parameters can be 
found in Refs. [HUE]. 



3. Optimizing quantum networks: the method of quantum combs 

A convenient way to optimize quantum networks is the method of quantum combs 
[25, 26] (see also the work by Gutoski and Watrous [27]). Here we briefly summarize 
some known basic facts about this method, referring the reader to the original papers 
for the proofs and for further details. 

We will use the following notation: $\mathrm{Lin}(\mathcal{H})$ will denote the set of 
linear operators on a (finite-dimensional) Hilbert space $\mathcal{H}$, $\mathrm{Lin}_+(\mathcal{H})$ will denote the set 
of positive operators on $\mathcal{H}$, while $\mathrm{St}(\mathcal{H})$ will denote the set of density matrices on $\mathcal{H}$, 
that is the set of positive operators $\rho \in \mathrm{Lin}_+(\mathcal{H})$ such that $\mathrm{Tr}[\rho] = 1$. 



3.1. Quantum combs. 



A network of quantum channels with internal memories can be associated with a non-negative 
operator satisfying suitable linear constraints. Precisely, a network of the form of Eq. (2) 
is associated to a positive operator $R \in \mathrm{Lin}_+\big(\bigotimes_{s \in S} (\mathcal{H}^{(s)}_{\mathrm{out}} \otimes \mathcal{H}^{(s)}_{\mathrm{in}})\big)$. The fact 
that the network consists of quantum channels (trace-preserving maps) imposes 
the following constraint: there must exist a set of positive operators 
$R^{(n)} \in \mathrm{Lin}_+\big(\bigotimes_{j=1}^{n} (\mathcal{H}^{(s_j)}_{\mathrm{out}} \otimes \mathcal{H}^{(s_j)}_{\mathrm{in}})\big)$, $n = 1, \dots, N-1$, such that 

$$ \mathrm{Tr}_{\mathrm{out},s_n}\big[R^{(n)}\big] = I_{\mathrm{in},s_n} \otimes R^{(n-1)}, \quad n = 1, \dots, N, \qquad R^{(N)} := R, \quad R^{(0)} := 1, \qquad (3) $$

where $\mathrm{Tr}_{\mathrm{out},s}$ and $I_{\mathrm{in},s}$ denote the partial trace over $\mathcal{H}^{(s)}_{\mathrm{out}}$ and the identity operator on 
$\mathcal{H}^{(s)}_{\mathrm{in}}$, respectively. 

Most importantly, the converse also holds [27, 25, 26]: if a positive operator 
$R$ satisfies the constraints of Eq. (3) for some set of positive operators $R^{(n)}$, $n = 
1, \dots, N-1$, then there exists a network of the form of Eq. (2) such that the operator 
associated to that network is $R$. This is important because it implies that optimizing 
over quantum networks is completely equivalent to optimizing over positive operators 
$R$ satisfying Eq. (3). In fact, given an operator $R$ satisfying Eq. (3) there is a constructive 
algorithm to build up the channels at all time steps $s \in S$ [29]. In the following, a 
positive operator $R$ satisfying Eq. (3) for some operators $R^{(n)}$, $n = 1, \dots, N-1$, 
will be called a quantum comb. We will denote the set of quantum 
combs as $\mathsf{Comb}\big(\bigotimes_{s \in S} (\mathcal{H}^{(s)}_{\mathrm{out}} \otimes \mathcal{H}^{(s)}_{\mathrm{in}})\big)$. 
3.2. Quantum testers. 

More generally, a quantum network can contain measurements: at each time step $s$ one 
can have a measurement with outcome $m_s$ in some set $M_s$. Conditionally to the outcome 
$m_s$, the input system will undergo a random transformation, represented by a completely 
positive trace non-increasing map $\mathcal{C}^{(s)}_{m_s}$, with the condition that the sum over all outcomes 
$\mathcal{C}^{(s)} := \sum_{m_s \in M_s} \mathcal{C}^{(s)}_{m_s}$ is trace-preserving. A network containing measurements 
can be associated with a collection of positive operators $T := \{T_m \mid m \in M := 
M_1 \times \dots \times M_N\}$ with the property that the sum over all outcomes $\sum_{m \in M} T_m$ 
satisfies Eq. (3). We call such a collection of operators a quantum tester. It is possible 
to prove that, if a collection of positive operators $T = \{T_m \mid m \in M\}$ is a quantum tester, 
then there exists a quantum network 
such that $T$ is the tester associated to that network [27, 25, 26]. Note that in this network the 
measurement takes place only in the last step, while the boxes $\mathcal{C}^{(s_n)}$, $n = 1, \dots, N-1$, 
represent quantum channels. 
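A particular type of testers are those where the first and last quantum systems are 
trivial, namely $\mathcal{H}^{(s_1)}_{\mathrm{in}} \simeq \mathcal{H}^{(s_N)}_{\mathrm{out}} \simeq \mathbb{C}$. These testers represent quantum networks that 
start with a state preparation and end with a POVM measurement. These are exactly 
the networks that are interesting for the estimation of quantum processes, as depicted in 
the estimation network of Section 2: note that to test a process consisting of $N$ time steps we need a tester consisting 
of $N + 1$ time steps. Labelling the Hilbert spaces of the tester by the input and output spaces 
$\mathcal{H}^{(s)}_{\mathrm{in}}$, $\mathcal{H}^{(s)}_{\mathrm{out}}$ of the tested process, 
the normalization of the tester $T$ becomes 

$$ \sum_{m \in M} T_m = I_{\mathrm{out},s_N} \otimes \Gamma^{(N)}, \qquad \mathrm{Tr}_{\mathrm{in},s_n}\big[\Gamma^{(n)}\big] = I_{\mathrm{out},s_{n-1}} \otimes \Gamma^{(n-1)}, \quad n = 2, \dots, N, \qquad \mathrm{Tr}_{\mathrm{in},s_1}\big[\Gamma^{(1)}\big] = 1, $$

for suitable positive operators $\Gamma^{(1)}, \dots, \Gamma^{(N)}$. 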
3.3. Generalized Born rule. 
If we test a process represented by the quantum comb $R \in \mathsf{Comb}\big(\bigotimes_{s \in S} (\mathcal{H}^{(s)}_{\mathrm{out}} \otimes \mathcal{H}^{(s)}_{\mathrm{in}})\big)$ 
with a network represented by the tester $T := \{T_m \mid m \in M\}$, then we obtain a 
probability distribution $p(m|R)$ over all possible outcomes, given by the generalized 
Born rule [25, 26] 

$$ p(m|R) = \mathrm{Tr}[T_m R]. \qquad (7) $$

Here the quantum comb $R$ plays the role of the density matrix in the ordinary Born 
rule, and the tester $\{T_m \mid m \in M\}$ plays the role of the POVM measurement. In fact, 
the ordinary Born rule is a very special case of Eq. (7), corresponding to the case of 
state preparation processes, which consist of a single time step ($N = 1$) with no input 
system ($\mathcal{H}^{(1)}_{\mathrm{in}} \simeq \mathbb{C}$). 
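
The generalized Born rule can be checked numerically in the simplest non-trivial case, a single-step process ($N = 1$) tested without ancilla. The sketch below is an illustration under stated assumptions: it uses the Choi operator $R = \sum_{ij} \mathcal{C}(|i\rangle\langle j|) \otimes |i\rangle\langle j|$ as the comb of the channel, and $T_m = P_m \otimes \psi^{T}$ as the tester of a network that prepares $\psi$ and measures the POVM $\{P_m\}$. This convention, the specific channel, the state, and the POVM are choices made for the example, not taken from the text.

```python
import numpy as np

# Hypothetical single-step process: a qubit channel given by Kraus operators.
Z = np.diag([1.0, -1.0]).astype(complex)
HAD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
q = 0.3
KRAUS = [np.sqrt(1 - q) * HAD, np.sqrt(q) * (Z @ HAD)]

def channel(rho):
    """Apply the channel rho -> sum_k K rho K^dagger."""
    return sum(K @ rho @ K.conj().T for K in KRAUS)

def choi(chan, d_in):
    """Choi operator R = sum_ij chan(|i><j|) (x) |i><j| on H_out (x) H_in."""
    R = 0
    for i in range(d_in):
        for j in range(d_in):
            E = np.zeros((d_in, d_in), dtype=complex)
            E[i, j] = 1.0
            R = R + np.kron(chan(E), E)
    return R

# Input state psi (with complex amplitudes, so the transpose matters).
v = np.array([np.sqrt(0.8), np.sqrt(0.2) * np.exp(0.4j)])
psi = np.outer(v, v.conj())

# Two-outcome projective POVM on the output.
w0 = np.array([1, 1j]) / np.sqrt(2)
w1 = np.array([1, -1j]) / np.sqrt(2)
povm = [np.outer(w0, w0.conj()), np.outer(w1, w1.conj())]

R = choi(channel, 2)
tester = [np.kron(P, psi.T) for P in povm]      # T_m = P_m (x) psi^T

# Generalized Born rule p(m|R) = Tr[T_m R] versus the direct computation.
p_comb = np.array([np.trace(T @ R).real for T in tester])
p_direct = np.array([np.trace(P @ channel(psi)).real for P in povm])
```

In this convention the two probability vectors coincide, which is the content of Eq. (7) for $N = 1$.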



4. The optimization problem of Quantum Metrology 

In process estimation one has a parametric family of processes with a given input-output 
structure and with a fixed number of time steps $N$ labelled by an index $s \in S \subset \mathbb{N}$. 
Each process is described by a quantum comb $R_x \in \mathsf{Comb}\big(\bigotimes_{s \in S} (\mathcal{H}^{(s)}_{\mathrm{out}} \otimes \mathcal{H}^{(s)}_{\mathrm{in}})\big)$, where 
$x \in \mathsf{X}$ is the parameter to be estimated. Let us denote by $\pi(x)$ the probability that 
the unknown parameter has the value $x$. If $x$ has a continuum of values, $\pi(x)$ will 
represent the probability density of $x$ with respect to some measure $dx$. For simplicity in 
the following we will present the results in the discrete case, but it is important to bear 
in mind that these results hold also in the continuous case, just replacing sums with 
integrals and replacing the quantifier "$\forall x \in \mathsf{X}$" with "$\forall x \in \mathsf{X}$ except at most for a set 
of zero measure". 
4.1. Primal maximization problem 

For an estimation strategy described by the quantum tester $T := \{T_{\hat{x}} \mid \hat{x} \in \mathsf{X}\}$, the 
probability distribution $p(\hat{x}|x)$ is given by Eq. (7). In order to evaluate the performance 
of a given strategy, we introduce a payoff function $g(\hat{x}, x)$, which quantifies the gain (or 
the loss) obtained by estimating $\hat{x}$ when the actual value is $x$. In general, as long as the 
payoff is lower bounded (that is, as long as there is a limit to the losses) one can choose 
without loss of generality $g(\hat{x}, x) \ge 0$, $\forall \hat{x}, x \in \mathsf{X}$. The expected payoff, averaged over 
the possible true values, is then given by 

$$ \gamma[T] := \sum_{x \in \mathsf{X}} \pi(x) \sum_{\hat{x} \in \mathsf{X}} g(\hat{x}, x)\, p(\hat{x}|x) = \sum_{\hat{x} \in \mathsf{X}} \mathrm{Tr}\big[T_{\hat{x}} G_{\hat{x}}\big], \qquad G_{\hat{x}} := \sum_{x \in \mathsf{X}} \pi(x)\, g(\hat{x}, x)\, R_x. \qquad (8) $$

An example of payoff function is $g(\hat{x}, x) = \delta_{\hat{x},x}$, which gives a unit gain if and only 
if the estimated value $\hat{x}$ coincides with the true value $x$. In this case the average gain 
coincides with the average probability of guessing the correct value 

$$ \gamma[T] = p_{\mathrm{succ}} := \sum_{x \in \mathsf{X}} \pi(x)\, p(x|x). $$

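A tester $T$ is optimal if it achieves the maximum payoff, defined as 

$$ \gamma_{\max} := \max_{T,\, \Gamma^{(1)}, \dots, \Gamma^{(N)}} \gamma[T] $$

subject to the constraints 

$$ \begin{aligned} T_{\hat{x}} &\ge 0, \quad \forall \hat{x} \in \mathsf{X}, \\ \sum_{\hat{x} \in \mathsf{X}} T_{\hat{x}} &= I_{\mathrm{out},s_N} \otimes \Gamma^{(N)}, \\ \mathrm{Tr}_{\mathrm{in},s_N}\big[\Gamma^{(N)}\big] &= I_{\mathrm{out},s_{N-1}} \otimes \Gamma^{(N-1)}, \\ &\;\;\vdots \\ \mathrm{Tr}_{\mathrm{in},s_1}\big[\Gamma^{(1)}\big] &= 1. \end{aligned} $$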

4.2. Dual minimization problem 

Maximizing the payoff in Eq. (8) is a semidefinite program. Using duality theory we 
now give a useful expression for the maximum payoff: 



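Theorem 1 The maximum payoff is given by 

$$ \gamma_{\max} = \min\Big\{ \lambda \ge 0 \;\Big|\; \exists R \in \mathsf{Comb}\Big(\bigotimes_{s \in S} \big(\mathcal{H}^{(s)}_{\mathrm{out}} \otimes \mathcal{H}^{(s)}_{\mathrm{in}}\big)\Big) : \lambda R \ge G_{\hat{x}},\ \forall \hat{x} \in \mathsf{X} \Big\}, $$

where $G_{\hat{x}}$ is defined as in Eq. (8). 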



The proof of the theorem, given in the Appendix, follows the same lines used 
by Gutoski [22] to prove strong duality for the minimum error discrimination of 
two quantum processes, which is the special instance of our problem corresponding to 
$\mathsf{X} := \{0, 1\}$ and $g(\hat{x}, x) = \delta_{\hat{x},x}$. Here we illustrate the result of Theorem 1 in a few special 
examples. 
4.3. Examples 

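4.3.1. State estimation. State estimation can be viewed as a special case where the 
unknown process $\mathcal{P}_x$ to be estimated consists only in the preparation of a quantum 
state $\rho_x \in \mathrm{Lin}_+(\mathcal{H})$ (that is, when there is only one time step $N = 1$, the output Hilbert 
space is $\mathcal{H}^{(1)}_{\mathrm{out}} = \mathcal{H}$ and the input Hilbert space is trivial, $\mathcal{H}^{(1)}_{\mathrm{in}} \simeq \mathbb{C}$). In this case, the 
expression of Theorem 1 becomes 

$$ \gamma_{\max} = \min\{ \lambda \ge 0 \mid \exists \rho \in \mathrm{St}(\mathcal{H}) : \lambda\rho \ge G_{\hat{x}},\ \forall \hat{x} \in \mathsf{X} \}, \qquad (10) $$
with $G_{\hat{x}} = \sum_{x \in \mathsf{X}} \pi(x)\, g(\hat{x}, x)\, \rho_x$. 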

4.3.2. Minimum error state discrimination. If $g(\hat{x}, x) = \delta_{\hat{x},x}$, the maximum payoff 
$\gamma_{\max}$ coincides with the maximum probability of guessing the correct value $p^{\max}_{\mathrm{succ}}$, so that 
maximizing the payoff is equivalent to minimizing the error probability. In this special 
case we retrieve from Eq. (10) the classic expression by Yuen, Kennedy, and Lax [28] 
(see also [30, 31]) 

$$ p^{\max}_{\mathrm{succ}} = \min\{ \mathrm{Tr}[\Lambda] \mid \Lambda \in \mathrm{Lin}(\mathcal{H}),\ \Lambda \ge \pi_x \rho_x,\ \forall x \in \mathsf{X} \} \qquad (11) $$
[the above expression follows from Eq. (10) with the definition $\Lambda := \lambda\rho$]. 
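
For two states the dual expression of Eq. (11) can be verified without an SDP solver. The sketch below is an illustration only: the two qubit states and the priors are hypothetical, and the dual candidate $\Lambda = \pi_1\rho_1 + (\pi_0\rho_0 - \pi_1\rho_1)_+$ is the standard Helstrom construction rather than anything specific to this paper. It checks that $\Lambda$ is feasible and that $\mathrm{Tr}[\Lambda]$ equals the success probability of the Helstrom measurement, so primal and dual values coincide.

```python
import numpy as np

def eig_parts(A):
    """Return (positive part of A, projector on its positive eigenspace)."""
    w, V = np.linalg.eigh(A)
    pos_part = (V * np.clip(w, 0, None)) @ V.conj().T
    projector = (V * (w > 0)) @ V.conj().T
    return pos_part, projector

def min_eig(A):
    """Smallest eigenvalue of a Hermitian matrix."""
    return np.linalg.eigvalsh(A).min()

# Hypothetical binary ensemble on a qubit.
pi0, pi1 = 0.6, 0.4
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
rho1 = np.array([[0.5, 0.25], [0.25, 0.5]])

delta = pi0 * rho0 - pi1 * rho1
delta_plus, P0 = eig_parts(delta)
P1 = np.eye(2) - P0

# Primal value: success probability of the measurement {P0, P1}.
p_primal = pi0 * np.trace(P0 @ rho0).real + pi1 * np.trace(P1 @ rho1).real

# Dual candidate Lambda and its feasibility: Lambda >= pi_x rho_x for x = 0, 1.
Lam = pi1 * rho1 + delta_plus
feasible = (min_eig(Lam - pi0 * rho0) > -1e-12) and (min_eig(Lam - pi1 * rho1) > -1e-12)
p_dual = np.trace(Lam).real

# Closed form 1/2 (1 + || pi0 rho0 - pi1 rho1 ||_1) for comparison.
p_closed = 0.5 * (1 + np.abs(np.linalg.eigvalsh(delta)).sum())
```

Since any feasible $\Lambda$ upper-bounds every measurement, equality of `p_primal` and `p_dual` certifies that both are optimal.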

4.3.3. State estimation/discrimination in the group covariant case. The dual 
expression for the maximum payoff has an interesting interpretation in the presence of 
symmetry. Let us first consider a simple case of state discrimination, where $\mathsf{X}$ is a finite 
group, the prior probability $\pi$ is uniform, that is, $\pi(x) = 1/|\mathsf{X}|$, and the unknown state $\rho_x$ 
is given by $\rho_x = U_x \rho_0 U_x^\dagger$, where $\rho_0 \in \mathrm{St}(\mathcal{H})$ is a fixed state and $U : \mathsf{X} \to \mathrm{Lin}(\mathcal{H})$, $x \mapsto U_x$ 
is a unitary representation of the group $\mathsf{X}$. In this case, it is easy to show that the 
minimization over $\Lambda = \lambda\rho$ in Eq. (11) can be restricted without loss of generality to 
invariant states, satisfying $U_x \rho U_x^\dagger = \rho$, $\forall x \in \mathsf{X}$. Hence, we have 

$$ \begin{aligned} p^{\max}_{\mathrm{succ}} &= \min\Big\{ \lambda \;\Big|\; \exists \rho \in \mathrm{St}(\mathcal{H}) : \rho \text{ is invariant},\ \rho \ge \frac{\rho_0}{\lambda |\mathsf{X}|} \Big\} = \frac{1}{|\mathsf{X}|\, q_{\max}}, \\ q_{\max} &:= \max\{ q \mid \exists \rho \in \mathrm{St}(\mathcal{H}) : \rho \text{ is invariant},\ q\rho_0 \le \rho \}. \end{aligned} \qquad (12) $$

By definition, $q_{\max}$ is the maximum probability that $\rho_0$ can have in an ensemble 
decomposition of an invariant state $\rho$, optimized over all possible invariant states. The 
probability $q_{\max}$ ranges between $1/|\mathsf{X}|$ and 1. Intuitively, it can be interpreted as a 
measure of how symmetric the state $\rho_0$ is: for $q_{\max} = 1$ the state $\rho_0$ is invariant, while 
for $q_{\max} = 1/|\mathsf{X}|$ the state $\rho_0$ generates a family of orthogonal states $\rho_x = U_x \rho_0 U_x^\dagger$. 

The result can be easily extended to the case of arbitrary payoff functions that are 
left-invariant under the action of the group, that is, functions $g$ satisfying the condition 
$g(y\hat{x}, yx) = g(\hat{x}, x)$, $\forall \hat{x}, x, y \in \mathsf{X}$. Moreover, the expression of Eq. (12) can be generalized 
to a form that holds also for continuous groups: 
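Corollary 1 Let $\mathsf{X}$ be a compact group, $g : \mathsf{X} \times \mathsf{X} \to \mathbb{R}$ be a left-invariant payoff 
function, and $\rho_x$ be the quantum state $\rho_x := U_x \rho_0 U_x^\dagger$, where $U : x \mapsto U_x$ is a unitary 
representation of the group $\mathsf{X}$. If the prior probability is given by the Haar measure $dx$, 
then the maximum average payoff over all quantum measurements is given by 

$$ \begin{aligned} \gamma_{\max} &= \frac{\gamma_0}{q_{\max}}, \qquad \gamma_0 := \int dx\, g(e, x), \\ q_{\max} &:= \max\{ q \mid \exists \rho \in \mathrm{St}(\mathcal{H}) : \rho \text{ is invariant},\ q\sigma_0 \le \rho \}, \\ \sigma_0 &:= \frac{1}{\gamma_0} \int_{\mathsf{X}} dx\, g(e, x)\, U_x \rho_0 U_x^\dagger, \end{aligned} $$

where $e \in \mathsf{X}$ denotes the identity element in the group $\mathsf{X}$. 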

Proof. Using the invariance of the Haar measure and of the payoff function it is 
easy to check that $G_{\hat{x}} = U_{\hat{x}} (\gamma_0 \sigma_0) U_{\hat{x}}^\dagger$. Using this fact, we can restrict the minimization 
in Eq. (10) to invariant states $\rho$ satisfying the condition $\lambda\rho \ge \gamma_0\sigma_0$. Finally, defining 
$q := \gamma_0/\lambda$ we can transform the minimization over $\lambda$ into a maximization over $q$, thus 
proving the thesis. ■ 
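
The covariance property used in the proof can be written out explicitly. The following worked derivation is a sketch that assumes only what the corollary states: left-invariance of $g$, invariance of the Haar measure, and the representation property $U_{\hat{x} y} = U_{\hat{x}} U_y$. The substitution variable $y$ is introduced here for illustration.

```latex
\begin{aligned}
G_{\hat{x}}
  &= \int dx \; g(\hat{x}, x)\, U_x \rho_0 U_x^\dagger \\
  &= \int dx \; g(e, \hat{x}^{-1} x)\, U_x \rho_0 U_x^\dagger
     && \text{(left-invariance of } g\text{)} \\
  &= \int dy \; g(e, y)\, U_{\hat{x} y}\, \rho_0\, U_{\hat{x} y}^\dagger
     && \text{(} x = \hat{x} y \text{, invariance of } dx\text{)} \\
  &= U_{\hat{x}} \Big( \int dy \; g(e, y)\, U_y \rho_0 U_y^\dagger \Big) U_{\hat{x}}^\dagger
   = U_{\hat{x}}\, (\gamma_0 \sigma_0)\, U_{\hat{x}}^\dagger .
\end{aligned}
```

With this form, if $(\lambda, \rho)$ is feasible then so is $(\lambda, \bar{\rho})$ with $\bar{\rho} := \int dx\, U_x^\dagger \rho\, U_x$, which is invariant; this is why the minimization can be restricted to invariant states.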



4.3.4. Binary discrimination of multi-time quantum processes. The discrimination of 
two multi-time processes $\mathcal{P}_0$ and $\mathcal{P}_1$ corresponds to the special case where $\mathsf{X} = \{0, 1\}$. In 
this case, the maximum probability of successful discrimination defines an operational 
norm in the real vector space generated by quantum processes [34, 35]. For prior 
probabilities $\pi_0$ and $\pi_1$, the probability of success and the norm are linked by the relation 

$$ p^{\max}_{\mathrm{succ}} = \frac{1}{2} \big( 1 + \| \pi_0 \mathcal{P}_0 - \pi_1 \mathcal{P}_1 \|_{\mathrm{op}} \big), $$

which generalizes the well-known expression by Helstrom [16] for the optimal 
discrimination between two quantum states. In the binary case the dual expression for 
the maximum success probability given by Theorem 1 coincides with the dual expression 
presented by Gutoski in Ref. [35]. 

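4.3.5. Process estimation/discrimination in the group covariant case. Consider the 
case of a general process $\mathcal{P}_x$ consisting of $N$ time steps. Suppose that $\mathcal{P}_x$ has the 
form $\mathcal{P}_x = \big(\bigotimes_{s \in S} \mathcal{V}^{(s)}_x\big)\, \mathcal{P}_0\, \big(\bigotimes_{s \in S} \mathcal{U}^{(s)\dagger}_x\big)$, where $\mathcal{P}_0$ is a fixed process and $\mathcal{U}^{(s)}_x(\rho) := 
U^{(s)}_x \rho\, U^{(s)\dagger}_x$ [$\mathcal{V}^{(s)}_x(\rho) := V^{(s)}_x \rho\, V^{(s)\dagger}_x$] is a unitary quantum channel representing the action 
of the group on the input (output) system at the $s$-th time step. 

Denoting by $R_x$ and $R_0$ the quantum combs corresponding to the processes $\mathcal{P}_x$ and 
$\mathcal{P}_0$, it is possible to show that $R_x = \big(\bigotimes_{s \in S} \mathcal{V}^{(s)}_x \otimes \mathcal{U}^{(s)*}_x\big)(R_0)$, where $\mathcal{U}^{(s)*}_x$ denotes the 
unitary channel associated with the complex conjugate $U^{(s)*}_x$ with respect to the computational basis [32]. 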

The result of Corollary [1] can then be generalized immediately to the case of general 
processes: 



A product rule for Quantum Metrology 






Corollary 2 Let $\mathsf{X}$ be a compact group, $g : \mathsf{X} \times \mathsf{X} \to \mathbb{R}$ be a left-invariant payoff 
function, and let $\rho_x$ be the quantum state $\rho_x := U_x \rho_0 U_x^\dagger$, where $U : x \mapsto U_x$ is a unitary 
representation of the group $\mathsf{X}$. If the prior probability is given by the Haar measure $dx$, 
then the maximum average payoff over all quantum measurements is given by 



7o 

Tmax 

Q'max 




where $e \in \mathsf{X}$ denotes the identity element in the group $\mathsf{X}$. 
Proof. Same proof as for Corollary 1. ■ 

5. Product rule for the estimation of independent processes 

Imagine that we have $K$ processes, where each process $\mathcal{P}_{k,x_k}$ corresponds to a quantum 
network as in figure (2) and is labelled by an unknown parameter $x_k$ in some set $\mathsf{X}_k$, 
$k = 1, \dots, K$. For every fixed $k$, all the processes $\{\mathcal{P}_{k,x_k} \mid x_k \in \mathsf{X}_k\}$ consist of the same 
number of time steps, which we label by an index $s_k$ in some set $S_k \subset \mathbb{N}$. At time 
$s_k$ each process $\mathcal{P}_{k,x_k}$ will transform an input system with Hilbert space $\mathcal{H}^{(s_k)}_{k,\mathrm{in}}$ into an 
output system with Hilbert space $\mathcal{H}^{(s_k)}_{k,\mathrm{out}}$. 
Let us denote by $\mathbf{x}$ the vector of parameters $\mathbf{x} := (x_1, \dots, x_K) \in \mathsf{X} := \mathsf{X}_1 \times \dots \times \mathsf{X}_K$. 
We say that the $K$ processes $\{\mathcal{P}_{k,x_k} \mid k = 1, \dots, K\}$ are independent when 

• two processes $\mathcal{P}_{k,x_k}$ and $\mathcal{P}_{l,x_l}$ with $k \ne l$ correspond to two disconnected quantum 
networks for every $x_k \in \mathsf{X}_k$ and for every $x_l \in \mathsf{X}_l$; 

• the prior distribution of the parameters factorizes as 

$$ \pi(\mathbf{x}) = \pi_1(x_1)\, \pi_2(x_2) \cdots \pi_K(x_K), \qquad (13) $$

where $\pi_k$ is the prior distribution for the parameter $x_k$. 

For example, the different parameters could be $K$ independent and uniformly 
distributed phase shifts. 

If $\{\mathcal{P}_{k,x_k} \mid k = 1, \dots, K\}$ are $K$ independent processes, we denote by $\mathcal{P}_{\mathbf{x}} := 
\mathcal{P}_{1,x_1} \otimes \mathcal{P}_{2,x_2} \otimes \dots \otimes \mathcal{P}_{K,x_K}$ the corresponding joint process. 

Suppose that we want to estimate the parameter $\mathbf{x}$ labelling the joint process $\mathcal{P}_{\mathbf{x}}$ and 
that our figure of merit is given by the payoff function $g(\hat{\mathbf{x}}, \mathbf{x})$. If we are interested in 
each parameter independently, then the payoff function for the estimation of the vector 
$\mathbf{x}$ is the product of the payoff functions for the estimation of its components: 

$$ g(\hat{\mathbf{x}}, \mathbf{x}) = \prod_{k=1}^{K} g_k(\hat{x}_k, x_k), \qquad g_k \ge 0,\ \forall k = 1, \dots, K, \qquad (14) $$

where the notation $g_k \ge 0$ means $g_k(\hat{x}_k, x_k) \ge 0$, $\forall \hat{x}_k, x_k \in \mathsf{X}_k$. For example, the payoff 
function could give a reward only when all the parameters are guessed correctly, so that 
$g(\hat{\mathbf{x}}, \mathbf{x}) = \delta_{\hat{\mathbf{x}},\mathbf{x}} = \prod_{k=1}^{K} \delta_{\hat{x}_k,x_k}$. 

Note that, in order to have a meaningful figure of merit for the estimation of the 
vector $\mathbf{x}$, it is important to have $g_k \ge 0$ for every $k$: otherwise, the product of two 
negative gains (i.e. of two losses) for two different parameters would count as a positive 
gain for the joint estimation of the vector $\mathbf{x}$. 

Based on the hypotheses of independence of the processes and on the product form 
of the payoff function we can prove the following theorem: 

Theorem 2 (Product rule for the estimation of K independent processes) 

Let $\mathcal{P}_{k,x_k}$, $k = 1, \dots, K$ be $K$ independent processes, each process labelled by an unknown 
parameter $x_k \in \mathsf{X}_k$ with prior probability $\pi_k(x_k)$. Then for a payoff function $g(\hat{\mathbf{x}}, \mathbf{x})$ of 
the product form of Eq. (14) the maximum payoff for the estimation of $\mathbf{x}$ is given by 
the product of the maximum payoffs for the estimation of its components: 

$$ \gamma_{\max} = \prod_{k=1}^{K} \gamma_{\max,k}, \qquad (15) $$

where $\gamma_{\max,k}$ is the maximum payoff achievable in the estimation of $x_k$. 

In other words, the optimal estimation of the vector $\mathbf{x}$ can be achieved by estimating 
each component $x_k$ independently. 

Proof. Clearly, we have $\gamma_{\max} \ge \prod_{k=1}^{K} \gamma_{\max,k}$, because restricting to product 
strategies can only reduce the maximum payoff. To prove the converse we use the 
dual minimization problem of Theorem 1, in which restricting to product combs can 
only increase the minimum. 

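Let $R_{k,x_k}$ be the quantum comb representing the process $\mathcal{P}_{k,x_k}$ and let $R_{\mathbf{x}} = 
\bigotimes_{k=1}^{K} R_{k,x_k}$ be the quantum comb representing the process $\mathcal{P}_{\mathbf{x}} = \bigotimes_{k=1}^{K} \mathcal{P}_{k,x_k}$. Let us 
introduce the notation 

$$ \begin{aligned} \mathsf{C}_k &:= \mathsf{Comb}\Big( \bigotimes_{s_k \in S_k} \big( \mathcal{H}^{(s_k)}_{k,\mathrm{out}} \otimes \mathcal{H}^{(s_k)}_{k,\mathrm{in}} \big) \Big), \\ \mathsf{C} &:= \mathsf{Comb}\Big( \bigotimes_{k=1}^{K} \bigotimes_{s_k \in S_k} \big( \mathcal{H}^{(s_k)}_{k,\mathrm{out}} \otimes \mathcal{H}^{(s_k)}_{k,\mathrm{in}} \big) \Big), \\ \mathsf{C}_{\mathrm{prod}} &:= \Big\{ R = \bigotimes_{k=1}^{K} R_k \;\Big|\; R_k \in \mathsf{C}_k\ \forall k = 1, \dots, K \Big\} \subset \mathsf{C}. \end{aligned} $$

With this notation we have that $R_{k,x_k}$ and $R_{\mathbf{x}}$ belong to $\mathsf{C}_k$ and $\mathsf{C}$, respectively. 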
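Define the positive operators 

$$ G_{k,\hat{x}_k} := \sum_{x_k \in \mathsf{X}_k} \pi_k(x_k)\, g_k(\hat{x}_k, x_k)\, R_{k,x_k}, \qquad G_{\hat{\mathbf{x}}} := \sum_{\mathbf{x} \in \mathsf{X}} \pi(\mathbf{x})\, g(\hat{\mathbf{x}}, \mathbf{x})\, R_{\mathbf{x}} = \bigotimes_{k=1}^{K} G_{k,\hat{x}_k}. $$

Then, by Theorem 1 we have 

$$ \begin{aligned} \gamma_{\max} &= \min\{ \lambda \ge 0 \mid \exists R \in \mathsf{C} : \lambda R \ge G_{\hat{\mathbf{x}}},\ \forall \hat{\mathbf{x}} \in \mathsf{X} \} \\ &\le \min\{ \lambda \ge 0 \mid \exists R \in \mathsf{C}_{\mathrm{prod}} : \lambda R \ge G_{\hat{\mathbf{x}}},\ \forall \hat{\mathbf{x}} \in \mathsf{X} \} \\ &\le \prod_{k=1}^{K} \min\{ \lambda_k \ge 0 \mid \exists R_k \in \mathsf{C}_k : \lambda_k R_k \ge G_{k,\hat{x}_k},\ \forall \hat{x}_k \in \mathsf{X}_k \} \\ &= \prod_{k=1}^{K} \gamma_{\max,k}. \end{aligned} $$

Here, the second inequality comes from the fact that if $\lambda_k R_k \ge G_{k,\hat{x}_k}$ for all $k$, then 
$\lambda R \ge G_{\hat{\mathbf{x}}}$ for $\lambda = \prod_k \lambda_k$ and $R = \bigotimes_k R_k$; this step uses the positivity $G_{k,\hat{x}_k} \ge 0$, 
which is guaranteed by the condition $g_k \ge 0$. ■ 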
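
The last step of the proof rests on a simple operator fact: if $A_k \ge B_k \ge 0$ for each $k$, then $A_1 \otimes A_2 \ge B_1 \otimes B_2$, because $A_1 \otimes A_2 - B_1 \otimes B_2 = (A_1 - B_1) \otimes A_2 + B_1 \otimes (A_2 - B_2)$ is a sum of positive operators. The sketch below checks this numerically on random matrices and shows that the implication fails without positivity. The matrices are arbitrary test data, not objects from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(d):
    """Random positive semidefinite d x d matrix."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return M @ M.conj().T

def min_eig(A):
    """Smallest eigenvalue of (the Hermitian part of) A."""
    return np.linalg.eigvalsh((A + A.conj().T) / 2).min()

# Case 1: A_k >= B_k >= 0, playing the roles of lambda_k R_k and G_k.
B1, B2 = random_psd(2), random_psd(3)
A1, A2 = B1 + random_psd(2), B2 + random_psd(3)
gap_positive_case = min_eig(np.kron(A1, A2) - np.kron(B1, B2))

# Case 2: A_k >= B_k but B_k is negative; the tensor inequality breaks down.
A = np.zeros((2, 2))
B = -np.eye(2)
gap_negative_case = min_eig(np.kron(A, A) - np.kron(B, B))
```

The second case mirrors the remark after Eq. (14): with negative payoffs, two "losses" multiply into a spurious gain and the product bound no longer holds.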

5.0.5. Relation with the product rules by Mittal and Szegedy. The technique used to 
prove that the optimal payoff is of the product form is directly inspired by a result by 
Mittal and Szegedy on product rules for semidefinite programming [23]. However, our 
result is not a direct application of the theorem in Ref. [23], which concerns product 
programs, where the linear constraint for the product program is the tensor product of 
the linear constraints for the individual programs. The theorem is not directly applicable 
in our case because in the joint estimation of $K$ processes the linear constraints on the 
joint tester are not the tensor product of the linear constraints for the estimation of each process 
separately. However, the crucial point here is that the tensor product of $K$ operators 
satisfying the constraints individually is an operator that satisfies the joint constraint 
and that this property is true both in the primal maximization problem and in the dual 
minimization program. 

5.0.1. Example 5: minimum error discrimination of K sets of processes. Theorem 2 can 
be applied to the case of minimum error discrimination of processes. Suppose that for 
every $k = 1, \dots, K$ we have a set of processes $\{\mathcal{P}_{k,x_k} \mid x_k \in \mathsf{X}_k\}$, each process $\mathcal{P}_{k,x_k}$ having 
prior probability $\pi_{k,x_k}$ ($\sum_{x_k \in \mathsf{X}_k} \pi_{k,x_k} = 1$). Denoting by $p^{\max}_{\mathrm{succ},k}$ the maximum probability 
of success in correctly identifying the $k$-th process, and by $p^{\max}_{\mathrm{succ}}$ the probability of 
success in correctly identifying all processes, we then have $p^{\max}_{\mathrm{succ}} = p^{\max}_{\mathrm{succ},1} \times \dots \times p^{\max}_{\mathrm{succ},K}$. The 
best joint strategy for discrimination is just the product of the best individual strategies. 
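
The product rule for discrimination can be illustrated in the simplest setting of state preparation processes. The sketch below is a hedged example with hypothetical data: two independent binary ensembles of pure qubit states, for which the Helstrom measurement and the dual operator $\Lambda_k$ are known in closed form. It checks that the product measurement achieves $p_1 p_2$ on the joint four-state ensemble and that $\Lambda_1 \otimes \Lambda_2$ is dual-feasible with trace $p_1 p_2$, so no entangled measurement can do better.

```python
import numpy as np

def pure_state(theta):
    """Real qubit state cos(theta)|0> + sin(theta)|1> as a density matrix."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

def helstrom(priors, states):
    """Optimal success probability, POVM and dual operator for two states."""
    delta = priors[0] * states[0] - priors[1] * states[1]
    w, V = np.linalg.eigh(delta)
    P0 = (V * (w > 0)) @ V.conj().T
    povm = [P0, np.eye(len(w)) - P0]
    lam = priors[1] * states[1] + (V * np.clip(w, 0, None)) @ V.conj().T
    p = sum(pr * np.trace(P @ rho).real for pr, P, rho in zip(priors, povm, states))
    return p, povm, lam

# Two hypothetical independent ensembles.
pri1, st1 = [0.5, 0.5], [pure_state(0.0), pure_state(0.5)]
pri2, st2 = [0.7, 0.3], [pure_state(0.0), pure_state(1.0)]
p1, povm1, lam1 = helstrom(pri1, st1)
p2, povm2, lam2 = helstrom(pri2, st2)

# Primal side: success probability of the product measurement on the joint ensemble.
p_product = sum(
    pa * pb * np.trace(np.kron(Pa, Pb) @ np.kron(ra, rb)).real
    for pa, Pa, ra in zip(pri1, povm1, st1)
    for pb, Pb, rb in zip(pri2, povm2, st2)
)

# Dual side: Lambda_1 (x) Lambda_2 dominates every joint operator pi_a pi_b rho_a (x) rho_b.
lam = np.kron(lam1, lam2)
dual_feasible = all(
    np.linalg.eigvalsh(lam - pa * pb * np.kron(ra, rb)).min() > -1e-9
    for pa, ra in zip(pri1, st1)
    for pb, rb in zip(pri2, st2)
)
dual_value = np.trace(lam).real
```

Because the primal and dual values both equal `p1 * p2`, the joint optimum is pinned to the product of the individual optima, as Theorem 2 states.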

5.1. Counterexamples 

Theorem 2 proves the optimality of product strategies under the hypotheses that the 
processes are independent and that the payoff function is of a product form. Here we 
show that if these hypotheses are dropped, the result may not hold. 

5.1.1. Minimum error discrimination of two pure states with multiple copies. Consider 
the minimum error discrimination of two pure states $\{\rho_0, \rho_1\}$ with prior probabilities 
$\{p_0, p_1\}$, in the case where $K$ identical copies of the unknown state are available. We 
can view this problem as an instance of minimum error discrimination of $K$ perfectly 
correlated preparation processes, each of which prepares one of the states $\{\rho_0, \rho_1\}$. 
Clearly, denoting by $p^{\max}_{\mathrm{succ}}(K)$ the probability of success with $K$ copies, we have that 
$p^{\max}_{\mathrm{succ}}(K)$ converges to 1 exponentially fast in the limit $K \to \infty$ [33]. On the other 
hand, the product of the probabilities of success, given by $[p^{\max}_{\mathrm{succ}}(K = 1)]^K$, tends to zero 
(exponentially fast) unless the two states are perfectly distinguishable. 
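
The gap between the two quantities can be made concrete with the Helstrom formula for two pure states, $p^{\max}_{\mathrm{succ}}(K) = \tfrac{1}{2}\big(1 + \sqrt{1 - 4 p_0 p_1 |\langle\psi_0|\psi_1\rangle|^{2K}}\big)$. The sketch below evaluates it for a hypothetical overlap of 0.9 and equal priors; these numbers and the function name are chosen only for illustration.

```python
import math

def p_succ(K, overlap, p0=0.5):
    """Helstrom success probability for K copies of one of two pure states."""
    p1 = 1.0 - p0
    return 0.5 * (1.0 + math.sqrt(1.0 - 4.0 * p0 * p1 * abs(overlap) ** (2 * K)))

overlap = 0.9                      # hypothetical |<psi_0|psi_1>|
single = p_succ(1, overlap)

# Joint optimum with K copies versus the naive product [p_succ(1)]^K.
table = {K: (p_succ(K, overlap), single ** K) for K in (1, 5, 20, 50)}
joint_50, product_50 = table[50]
```

For perfectly correlated processes the joint success probability grows with $K$ while the product decays, so the product formula of Theorem 2 cannot hold here.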

5.1.2. Estimation of two independent phase shifts with a correlated payoff function. 
Consider the estimation of two independent phase shifts on two qubit systems, with 
Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively ($\mathcal{H}_1 \simeq \mathcal{H}_2 \simeq \mathbb{C}^2$). Denoting by $|0\rangle$ and $|1\rangle$ the 
two orthonormal vectors in the standard basis in $\mathbb{C}^2$, the phase shifts on a qubit system 
are given by $U_x = |0\rangle\langle 0| + e^{ix} |1\rangle\langle 1|$, $x \in [0, 2\pi)$. We assume that the phase shifts on the 
two qubits are uniformly distributed according to the Haar measure $dx/2\pi$. The problem 
is then to find the best estimate of the unknown parameter $\mathbf{x} := (x_1, x_2)$ characterizing 
the black boxes $U_{x_1}$ and $U_{x_2}$. As a figure of merit, we consider the maximization of the 
payoff function 

$$ g_p(\hat{\mathbf{x}}, \mathbf{x}) = p \cos(\hat{x}_1 + \hat{x}_2 - x_1 - x_2) + (1 - p) \cos(\hat{x}_1 - \hat{x}_2 - x_1 + x_2), $$

for some $p \in [0, 1]$. Note that $g_p$ is a convex combination of the figure of merit 
$\cos(\hat{x}_1 + \hat{x}_2 - x_1 - x_2)$, which quantifies how good our estimate of the sum $s := x_1 + x_2$ is, 
and of the figure of merit $\cos(\hat{x}_1 - \hat{x}_2 - x_1 + x_2)$, which quantifies how good our estimate 
of the difference $d := x_1 - x_2$ is. In other words, we can interpret $g_p$ as expressing 
the fact that, with probability $p$, we will be asked to estimate the sum, while with 
probability $(1 - p)$ we will be asked to estimate the difference. 

Due to the symmetry of the problem, it is enough to consider quantum networks 
where the two unknown phase shifts are applied in parallel on a suitable entangled 
state $|E\rangle \in \mathcal{H}_1 \otimes \mathcal{H}_2$, as proven in Ref. [31]. No additional reference system is needed, 
because the black boxes form a unitary representation of an abelian group [36]. Hence, 
the problem is reduced to the optimal estimation of $\mathbf{x}$ from the output state 
$|E_{\mathbf{x}}\rangle := (U_{x_1} \otimes U_{x_2}) |E\rangle$. 

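From the theory of optimal estimation of group parameters [36] we know that the 
optimal measurement is given by the covariant POVM 

$$ P_{\hat{\mathbf{x}}} = (U_{\hat{x}_1} \otimes U_{\hat{x}_2})\, |\eta\rangle\langle\eta|\, (U_{\hat{x}_1} \otimes U_{\hat{x}_2})^\dagger, \qquad |\eta\rangle := |0\rangle|0\rangle + |0\rangle|1\rangle + |1\rangle|0\rangle + |1\rangle|1\rangle. $$

Incidentally, we note that the POVM is of the product form $P_{\hat{\mathbf{x}}} = P_{1,\hat{x}_1} \otimes P_{2,\hat{x}_2}$. By 
direct calculation, we then find that the average value of $g_p$ is $\gamma_p = \langle E| G_p |E\rangle$ with 

$$ G_p = \frac{p}{2} \big( |0\rangle|0\rangle\langle 1|\langle 1| + |1\rangle|1\rangle\langle 0|\langle 0| \big) + \frac{1 - p}{2} \big( |0\rangle|1\rangle\langle 1|\langle 0| + |1\rangle|0\rangle\langle 0|\langle 1| \big). $$

Clearly, the maximum eigenvalue of $G_p$ is $\lambda_{\max} = \max\{p/2, (1 - p)/2\}$, corresponding 
to the nondegenerate eigenvector $|E\rangle = 2^{-1/2}(|0\rangle|0\rangle + |1\rangle|1\rangle)$ for $p > 1/2$ and $|E\rangle = 
2^{-1/2}(|0\rangle|1\rangle + |1\rangle|0\rangle)$ for $p < 1/2$. For $p = 1/2$ one has degeneracy, and the optimal 
input state can be chosen of the product form $|E\rangle = |+\rangle|+\rangle$ with $|+\rangle = 2^{-1/2}(|0\rangle + |1\rangle)$. 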

The qualitative explanation of this behaviour is the following: for $p = 1/2$ the
figure of merit is factorized ($g_{1/2} = \cos(\hat x_1 - x_1)\cos(\hat x_2 - x_2)$) and the estimation strategy
is factorized too. For every value $p \neq 1/2$, the degeneracy is removed and suddenly
the optimal input state becomes maximally entangled. Note, however, that there is no
discontinuity in the average payoff.
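The factorization at $p = 1/2$ follows from the product-to-sum identity for cosines. As a short worked step, with the shorthand $a := \hat x_1 - x_1$ and $b := \hat x_2 - x_2$ introduced here only for brevity:

```latex
\begin{aligned}
g_{1/2}(\hat x, x)
  &= \tfrac12 \cos(a + b) + \tfrac12 \cos(a - b) \\
  &= \tfrac12 \left[\cos a \cos b - \sin a \sin b\right]
   + \tfrac12 \left[\cos a \cos b + \sin a \sin b\right] \\
  &= \cos a \,\cos b .
\end{aligned}
```

For $p \neq 1/2$ the $\sin a \sin b$ terms no longer cancel, which is why the figure of merit stops being of the product form.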



5.1.3. Estimating the sum of K independent phase shifts. Suppose that we have
$K$ identical systems, with Hilbert spaces $\mathcal H_k \simeq \mathcal H$ for all $k = 1,\dots,K$, and
suppose that each system undergoes an independent phase shift $U^{(k)}_{x_k} := e^{i x_k H^{(k)}}$, where
$H^{(k)} := \sum_{n=1}^N n\, |n\rangle\langle n|$ for every $k$, $\{|n\rangle\}$ being the computational basis.

If we want to estimate the sum $s := \sum_k x_k$, a natural figure of merit is the cost
function $c(\hat s, s) = 2[1 - \cos(\hat s - s)]$. This cost function is well known in the phase
estimation literature as a smooth and periodic version of the variance [HI El HI 12] . 
For small $\hat s - s$, we have indeed $c(\hat s, s) \simeq (\hat s - s)^2$. Clearly, minimizing $c$ is equivalent to
maximizing the payoff function $g(\hat s, s) = 1 + \cos(\hat s - s)$.
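The small-error behaviour of the cost function can be made explicit with a Taylor expansion. In the following worked step, $\delta := \hat s - s$ is a shorthand introduced here for the estimation error:

```latex
c(\hat s, s) = 2\left[1 - \cos\delta\right]
             = 4 \sin^2\!\left(\frac{\delta}{2}\right)
             = \delta^2 - \frac{\delta^4}{12} + O(\delta^6),
\qquad
g(\hat s, s) = 1 + \cos\delta = 2 - \frac{c(\hat s, s)}{2} .
```

The second relation shows that the payoff is an affine, decreasing function of the cost, so the two optimization problems have the same optimal strategy.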

Let us find the optimal estimation strategy. First, using the fact that the unknown
black boxes form a unitary representation of an abelian group, we know that the optimal
strategy consists in applying the black boxes in parallel on an entangled input state
$|E\rangle \in \mathcal H^{\otimes K}$. Moreover, note that for every fixed $i$ and $j$, if we apply the transformation
$x_i \mapsto x_i + \xi$, $\hat x_i \mapsto \hat x_i + \xi$, $x_j \mapsto x_j - \xi$, $\hat x_j \mapsto \hat x_j - \xi$, with $\xi \in [0, 2\pi)$, then the value of the figure
of merit does not change. Using this symmetry it is easy to show that the input state $|E\rangle$
must be an eigenstate of the difference operator $\Delta_{ij} = H^{(i)} - H^{(j)}$ for every possible pair
$i,j$. It is then straightforward that the optimal choice is $|E\rangle = \sum_{n=1}^N e_n |n\rangle^{\otimes K}$, where
$\{e_n\}$ are suitable coefficients. The problem then becomes to estimate the sum $s$ from
the state $|E_x\rangle := \left(\bigotimes_k U^{(k)}_{x_k}\right)|E\rangle = \sum_{n=1}^N e^{isn} e_n |n\rangle^{\otimes K}$. From the theory of optimal phase

estimation we know that the minimum cost is $c_{\min} = 4 \sin^2\left[\frac{\pi}{2(N+1)}\right]$, which behaves as
$\pi^2/N^2$ in the limit $N \to \infty$ (see Ref. [1]). The corresponding optimal state is the entangled
state [1]

$$ |E_{\rm opt}\rangle = \sqrt{\frac{2}{N+1}} \sum_{n=1}^N \sin\left(\frac{\pi n}{N+1}\right) |n\rangle^{\otimes K} $$

and the optimal POVM is $P_{\hat s} = |\eta_{\hat s}\rangle\langle\eta_{\hat s}|$, $|\eta_{\hat s}\rangle := \sum_{n=1}^N e^{i\hat s n} |n\rangle^{\otimes K}$. It is easy to see that the
use of entanglement implies an advantage over factorized strategies, where each system
is prepared independently in a state $|e_k\rangle$ and is measured independently with the optimal
POVM. Indeed, if we choose the optimal states $|e_k\rangle = |e\rangle := \sqrt{\frac{2}{N+1}} \sum_{n=1}^N \sin\left(\frac{\pi n}{N+1}\right) |n\rangle$
and the optimal product POVM $P_{\hat x} := \bigotimes_k U^{(k)}_{\hat x_k} \left(N |+\rangle\langle +|\right) U^{(k)\dagger}_{\hat x_k}$, with $|+\rangle := N^{-1/2} \sum_{n=1}^N |n\rangle$, then we obtain the cost

$$ \langle c(\hat s, s)\rangle = 2\big(1 - \langle\cos(\hat s - s)\rangle\big) = 2\Big(1 - \prod_k \langle\cos(\hat x_k - x_k)\rangle\Big), $$

where $\langle f\rangle$ denotes the expectation value of the function $f$. For large $N$ we get
the asymptotic expression $\langle c\rangle \simeq K\pi^2/N^2$. From the comparison with the optimal value
$c_{\min} \simeq \pi^2/N^2$ we note that entangling $K$ systems and performing a joint measurement
implies a reduction of the variance by a factor $K$ in the estimation of the sum.
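The factor-$K$ gap between the entangled and the factorized strategy can be checked numerically. The sketch below rests on one assumption that is not spelled out in the text: for a covariant measurement on $N$ levels, the average of $\cos(\hat s - s)$ equals $\langle e|C|e\rangle$, where $C$ is the $N \times N$ matrix with entries $1/2$ on the first off-diagonals. The function names are hypothetical.

```python
import numpy as np

def cosine_matrix(N):
    """Assumed matrix C with <e|C|e> = <cos(s_hat - s)> for amplitudes e_1..e_N."""
    return 0.5 * (np.eye(N, k=1) + np.eye(N, k=-1))

def entangled_cost(N):
    """Minimum of 2(1 - <cos>) over states in span{|n>^(tensor K)}."""
    top = np.linalg.eigvalsh(cosine_matrix(N))[-1]
    return 2.0 * (1.0 - top)

def factorized_cost(N, K):
    """Cost 2(1 - prod_k <cos>) when each of the K systems is optimized separately."""
    top = np.linalg.eigvalsh(cosine_matrix(N))[-1]
    return 2.0 * (1.0 - top ** K)

N, K = 200, 5
c_ent = entangled_cost(N)
c_fac = factorized_cost(N, K)
ratio = c_fac / c_ent   # approaches K as N grows
```

Under this assumption the largest eigenvalue of $C$ is $\cos[\pi/(N+1)]$, so `c_ent` reproduces $4\sin^2[\pi/(2(N+1))]$ and `ratio` tends to $K$ for large $N$.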

6. Conclusions 

In this paper we addressed the estimation of an unknown quantum process that can 
possibly consist of a finite number of time steps. We formulated the search for the
optimal quantum network for estimation as a semidefinite program, and used duality
theory to give an alternative expression of the maximum payoff achieved by the optimal 
network. Using this result we proved a product rule for quantum metrology, showing 
that the individual strategies are sufficient to achieve the optimal joint estimate of a set 
of independent processes. In particular, the probability of success in the discrimination 
of K sets of processes is the product of the probabilities of success for each set. 

It is easy to see that the product rule established here for joint estimation can also 
be generalized to the optimization of quantum networks for other tasks, such as the 
optimal cloning of independent sets of states. For example, in the case of cloning the 
product rule shows that the maximum fidelity for the joint cloning of K sets of states 
is the product of the maximum fidelities for each set, so that the optimal joint cloner is
the product of the optimal individual cloners.
Appendix

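Proof of the theorem. Define the block diagonal matrices $T := \left(\bigoplus_{n=1}^N T^{(n)}\right) \oplus \left(\bigoplus_{\hat x \in X} T_{\hat x}\right)$
and $G := \left(\bigoplus_{n=1}^N 0_n\right) \oplus \left(\bigoplus_{\hat x \in X} G^{(N)}_{\hat x}\right)$, where $0_i$ denotes the zero matrix
in the $i$-th block. With these definitions, the optimization problem for $\gamma_{\max}$ can be
written as a semidefinite program in the standard form

$$ \gamma_{\max} = \max_T \operatorname{Tr}[TG] \quad \text{subject to} \quad T \ge 0, \quad \mathcal L(T) = K, \qquad (16) $$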
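where $\mathcal L$ is the Hermitian-preserving linear map defined by $\mathcal L(T) = \bigoplus_{j=0}^N R^{(j)}$ with

$$ \begin{aligned}
R^{(0)} &= \operatorname{Tr}_{{\rm in},S_1}[T^{(1)}] \\
R^{(1)} &= \operatorname{Tr}_{{\rm in},S_2}[T^{(2)}] - I_{{\rm out},S_1} \otimes T^{(1)} \\
&\;\;\vdots \\
R^{(N-1)} &= \operatorname{Tr}_{{\rm in},S_N}[T^{(N)}] - I_{{\rm out},S_{N-1}} \otimes T^{(N-1)} \\
R^{(N)} &= \sum_{\hat x \in X} T_{\hat x} - I_{{\rm out},S_N} \otimes T^{(N)}
\end{aligned} $$

and $K$ is the block diagonal operator $K := \bigoplus_{j=0}^N K^{(j)}$ defined by $K^{(0)} = 1$ and $K^{(j)} = 0_j$
for every $j = 1, \dots, N$.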

Using the duality of semidefinite programming we obtain 

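$$ \gamma_{\max} \le \gamma_* := \min \operatorname{Tr}[SK] \quad \text{subject to} \quad \mathcal L^\dagger(S) \ge G, \qquad (17) $$

where $S = \bigoplus_{j=0}^N S^{(j)}$ and $\mathcal L^\dagger$ is the dual map defined by $\langle S, \mathcal L(T)\rangle = \langle \mathcal L^\dagger(S), T\rangle$,
with $\langle S, T\rangle := \operatorname{Tr}[S^\dagger T]$ the Hilbert-Schmidt product. Using the definition of $\mathcal L$, it is
easy to check that $\mathcal L^\dagger(S) = \left(\bigoplus_{n=1}^N M_n\right) \oplus \left(\bigoplus_{\hat x \in X} M_{\hat x}\right)$, where

$$ \begin{aligned}
M_1 &= I_{{\rm in},S_1} S^{(0)} - \operatorname{Tr}_{{\rm out},S_1}[S^{(1)}] \\
M_2 &= I_{{\rm in},S_2} \otimes S^{(1)} - \operatorname{Tr}_{{\rm out},S_2}[S^{(2)}] \\
&\;\;\vdots \\
M_N &= I_{{\rm in},S_N} \otimes S^{(N-1)} - \operatorname{Tr}_{{\rm out},S_N}[S^{(N)}] \\
M_{\hat x} &= S^{(N)} \qquad \forall \hat x \in X .
\end{aligned} $$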
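The inequality between the primal and dual values is the usual weak-duality argument. As a worked step, for any primal-feasible $T$ (that is, $T \ge 0$ and $\mathcal L(T) = K$) and any dual-feasible Hermitian $S$ (that is, $\mathcal L^\dagger(S) \ge G$) one has:

```latex
\begin{aligned}
\operatorname{Tr}[SK]
  &= \langle S, \mathcal L(T) \rangle
   = \langle \mathcal L^\dagger(S), T \rangle \\
  &= \operatorname{Tr}[GT]
   + \operatorname{Tr}\!\big[\big(\mathcal L^\dagger(S) - G\big)\, T\big]
   \;\ge\; \operatorname{Tr}[GT],
\end{aligned}
```

where the last step uses that the trace of a product of two positive semidefinite operators is non-negative. Maximizing over $T$ and minimizing over $S$ gives $\gamma_{\max} \le \gamma_*$.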

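Recalling the definition of $K$ and $G$, the expression for $\gamma_*$ becomes

$$ \begin{aligned}
\gamma_* = \min\; & S^{(0)} \\
\text{subject to}\; & I_{{\rm in},S_1} S^{(0)} \ge \operatorname{Tr}_{{\rm out},S_1}[S^{(1)}] \\
& I_{{\rm in},S_2} \otimes S^{(1)} \ge \operatorname{Tr}_{{\rm out},S_2}[S^{(2)}] \\
& \;\;\vdots \\
& I_{{\rm in},S_N} \otimes S^{(N-1)} \ge \operatorname{Tr}_{{\rm out},S_N}[S^{(N)}] \\
& S^{(N)} \ge G^{(N)}_{\hat x} \qquad \forall \hat x \in X .
\end{aligned} \qquad (18) $$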



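Note that $S^{(N)}$ must be positive, since we have $S^{(N)} \ge G^{(N)}_{\hat x} \ge 0$. Consequently, $S^{(j)}$
must be positive for every $j = 0, \dots, N$. Moreover, there exists at least one operator $S$
such that $\mathcal L^\dagger(S) > G$. For example, one can choose

$$ \begin{aligned}
S^{(N)} &= g_{\max} \bigotimes_{n=1}^N \left(I_{{\rm out},S_n} \otimes I_{{\rm in},S_n}\right), \qquad g_{\max} := \max_{\hat x, x} g(\hat x, x) \\
S^{(N-1)} &= 2 \operatorname{Tr}_{{\rm out},S_N} \operatorname{Tr}_{{\rm in},S_N}[S^{(N)}] \\
&\;\;\vdots \\
S^{(0)} &= 2 \operatorname{Tr}_{{\rm out},S_1} \operatorname{Tr}_{{\rm in},S_1}[S^{(1)}] .
\end{aligned} $$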






The existence of an operator $S$ such that $\mathcal L^\dagger(S) > G$, along with the fact that
the maximum payoff $\gamma_{\max}$ is bounded by $g_{\max}$, implies that the hypotheses of Slater's
theorem (see e.g. |35j EZ]) on strong duality are satisfied. Hence, the optimum values 
for the primal and dual optimization problems coincide: $\gamma_{\max} = \gamma_*$.

Now, we show that the first $N$ inequalities can be chosen to be equalities without
loss of generality: we show that for every operator $S$ satisfying the constraints there
exists another operator $\tilde S$ that achieves the equality in the first $N$ constraints and has
the same value of the objective function as $S$. To prove this statement, we proceed by
induction. First, we define the operator $\tilde S := \bigoplus_{j=0}^N \tilde S^{(j)}$ through the relations

$$ \begin{aligned}
\tilde S^{(0)} &:= S^{(0)} \\
\delta^{(1)} &:= I_{{\rm in},S_1} S^{(0)} - \operatorname{Tr}_{{\rm out},S_1}[S^{(1)}] \ge 0 \\
\tilde S^{(1)} &:= S^{(1)} + \rho_1 \otimes \delta^{(1)} \\
\tilde S^{(j)} &:= S^{(j)} \qquad \forall j = 2, \dots, N
\end{aligned} $$

where $\rho_1$ is an arbitrary quantum state in ${\rm St}(\mathcal H_{{\rm out},S_1})$. Clearly, with this definition we
have $\operatorname{Tr}_{{\rm out},S_1}[\tilde S^{(1)}] = I_{{\rm in},S_1} \tilde S^{(0)}$, that is, $\tilde S$ achieves the equality in the first constraint.
Moreover, since $\delta^{(1)}$ is positive we have $I_{{\rm in},S_2} \otimes \tilde S^{(1)} \ge I_{{\rm in},S_2} \otimes S^{(1)} \ge \operatorname{Tr}_{{\rm out},S_2}[S^{(2)}] =
\operatorname{Tr}_{{\rm out},S_2}[\tilde S^{(2)}]$, namely $\tilde S$ satisfies the second constraint. Hence, the operator $\tilde S$ has the
same objective value as $S$, satisfies all the constraints and achieves the equality in the
first. Now, suppose that $S$ achieves the equality in the first $k \ge 1$ constraints and define



$$ \begin{aligned}
\tilde S^{(j)} &:= S^{(j)} \qquad \forall j = 0, \dots, k \\
\delta^{(k+1)} &:= I_{{\rm in},S_{k+1}} \otimes S^{(k)} - \operatorname{Tr}_{{\rm out},S_{k+1}}[S^{(k+1)}] \ge 0 \\
\tilde S^{(k+1)} &:= S^{(k+1)} + \rho_{k+1} \otimes \delta^{(k+1)} \\
\tilde S^{(j)} &:= S^{(j)} \qquad \forall j = k+2, \dots, N
\end{aligned} $$

where $\rho_{k+1}$ is an arbitrary quantum state in ${\rm St}(\mathcal H_{{\rm out},S_{k+1}})$. With this definition it is
immediate to see that $\tilde S$ has the same objective value as $S$, satisfies all constraints and
achieves the equality in the first $k+1$ ones. By induction, we conclude that for every
operator $S$ satisfying the constraints there exists another operator $\tilde S$ which achieves the
equality in the first $N$ constraints and has the same objective value. Defining $\lambda := \tilde S^{(0)}$
and $R := \tilde S^{(N)}/\lambda$ we then obtain the thesis of the theorem. ∎
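The padding step used in the induction can be illustrated numerically for the first constraint. This is a minimal sketch, assuming the tensor ordering (output system) ⊗ (input system) and randomly generated operators; the dimensions and the helper names `random_positive` and `trace_out_first` are hypothetical choices made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 2, 3   # illustrative dimensions of H_out,S1 and H_in,S1

def random_positive(d):
    """Random positive semidefinite d x d matrix."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return A @ A.conj().T

def trace_out_first(X, d_out, d_in):
    """Partial trace over the first (output) factor of an operator on out (x) in."""
    return np.trace(X.reshape(d_out, d_in, d_out, d_in), axis1=0, axis2=2)

# A dual-feasible pair for the first constraint: I_in * S0 >= Tr_out[S1]
S1 = random_positive(d_out * d_in)
reduced = trace_out_first(S1, d_out, d_in)
S0 = 1.1 * np.linalg.eigvalsh(reduced)[-1]

# Slack of the first constraint and an arbitrary state on the output system
delta = S0 * np.eye(d_in) - reduced
rho = random_positive(d_out)
rho /= np.trace(rho).real

# Padded operator: the first constraint now holds with equality
S1_tilde = S1 + np.kron(rho, delta)
```

After the padding, `trace_out_first(S1_tilde, d_out, d_in)` equals `S0` times the identity, while `S1_tilde - S1` is positive semidefinite, so every later constraint that was satisfied by `S1` from above remains satisfied.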